Performance Functions and Reinforcement Learning for Trading Systems and Portfolios
Abstract
We propose to train trading systems and portfolios by optimizing objective functions that directly measure trading and investment performance. Rather than basing a trading system on forecasts or training via a supervised learning algorithm using labelled trading data, we train our systems using recurrent reinforcement learning (RRL) algorithms. The performance functions that we consider for reinforcement learning are profit or wealth, economic utility, the Sharpe ratio and our proposed differential Sharpe ratio. The trading and portfolio management systems require prior decisions as input in order to properly take into account the effects of transactions costs, market impact and taxes. This temporal dependence on system state requires the use of reinforcement versions of standard recurrent learning algorithms. We present empirical results in controlled experiments that demonstrate the efficacy of some of our methods for optimizing trading systems and portfolios. For a long/short trader, we find that maximizing the differential Sharpe ratio yields more consistent results than maximizing profits, and that both methods outperform a trading system based on forecasts that minimize MSE. We find that portfolio traders trained to maximize the differential Sharpe ratio achieve better risk-adjusted returns than those trained to maximize profit. Finally, we provide simulation results for an S&P 500 / TBill asset allocation system that demonstrate the presence of out-of-sample predictability in the monthly S&P 500 stock index for the 25 year period 1970 through 1994.
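The abstract does not reproduce the formula, but the differential Sharpe ratio it refers to can be computed online from two exponential moving averages of the trading returns, following the moving-average formulation used in Moody & Wu (1997). The sketch below is ours, not the authors' code; the adaptation rate eta and the variable names are illustrative choices.

```python
# Minimal sketch (not the authors' code) of an online differential Sharpe ratio,
# following the exponential-moving-average formulation described in Moody & Wu (1997).
# The adaptation rate `eta` and the zero initialization are our assumptions.

def differential_sharpe_ratio(returns, eta=0.01):
    """Yield the differential Sharpe ratio D_t for each trading return R_t."""
    A, B = 0.0, 0.0              # running estimates of the first and second moments
    for R in returns:
        dA = R - A               # innovation in the mean-return estimate
        dB = R * R - B           # innovation in the second-moment estimate
        denom = (B - A * A) ** 1.5
        D = (B * dA - 0.5 * A * dB) / denom if denom > 0 else 0.0
        yield D
        A += eta * dA            # update the moving averages after computing D_t
        B += eta * dB
```

Because D_t depends only on the current return and two running statistics, it can be maximized incrementally at each time step, which is what makes it suitable as a performance function for online reinforcement learning.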
Related Papers
Reinforcement Learning for Trading Systems and Portfolios
We propose to train trading systems by optimizing financial objective functions via reinforcement learning. The performance functions that we consider as value functions are profit or wealth, the Sharpe ratio and our recently proposed differential Sharpe ratio for online learning. In Moody & Wu (1997), we presented empirical results in controlled experiments that demonstrated the advantages of ...
Learning to trade via direct reinforcement
We present methods for optimizing portfolios, asset allocations, and trading systems based on direct reinforcement (DR). In this approach, investment decision-making is viewed as a stochastic control problem, and strategies are discovered directly. We present an adaptive algorithm called recurrent reinforcement learning (RRL) for discovering investment policies. The need to build forecasting mo...
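The recurrent structure that direct reinforcement relies on can be made concrete with a small sketch. The tanh parameterization, the fixed return window, and the cost rate delta below are illustrative assumptions rather than the paper's exact specification; the essential point is that the previous position F_{t-1} feeds back into both the new decision and the transaction-cost term.

```python
# Minimal single-asset sketch (assumptions, not the paper's exact specification)
# of a recurrent long/short trader: the position F_t depends on recent market
# returns and on the previous position F_{t-1}, so the cost of changing position
# can be charged inside the performance function being maximized.

import numpy as np

def simulate_rrl_trader(r, w, u, b, delta=0.001, window=8):
    """
    r      : array of per-period market returns r_t
    w, u, b: weights on the recent-return window, the previous position, and a bias
    delta  : proportional transaction-cost rate (illustrative value)
    Returns the position series F and trading returns R, with
    R_t = F_{t-1} * r_t - delta * |F_t - F_{t-1}|.
    """
    T = len(r)
    F = np.zeros(T)                      # F[t]: position taken at time t (flat before `window`)
    R = np.zeros(T)                      # R[t]: trading return realized at time t
    for t in range(window, T):
        x = r[t - window:t]              # recent returns used as input features
        F[t] = np.tanh(np.dot(w, x) + u * F[t - 1] + b)
        R[t] = F[t - 1] * r[t] - delta * abs(F[t] - F[t - 1])
    return F, R
```

Because F_{t-1} appears in both the decision and the cost of changing position, the gradient of any performance function of the R_t series must be propagated through time, which is why these papers call for recurrent versions of standard learning algorithms rather than supervised training on labelled trades.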
Reinforcement Learning for Trading
We propose to train trading systems by optimizing financial objective functions via reinforcement learning. The performance functions that we consider are profit or wealth, the Sharpe ratio and our recently proposed differential Sharpe ratio for online learning. In Moody & Wu (1997), we presented empirical results that demonstrate the advantages of reinforcement learning relative to supervised ...
Reinforcement Learning Based PID Control of Wind Energy Conversion Systems
In this paper an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. The adaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinear characteristics of wind variations as plant input, wind turbine structure and generator operational behavior demand a high-quality adaptive controller to ensure both robust stability an...
A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems
This paper presents a reinforcement learning framework for stock trading systems. Trading system parameters are optimized by a Q-learning algorithm, and neural networks are adopted for value approximation. In this framework, multiple cooperative agents are used to efficiently integrate global trend prediction and local trading strategy for obtaining better trading performance. Agents communicate wi...
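As a simplified illustration of the value-based approach described here: the framework combines multiple cooperating agents with neural-network value approximation, whereas the sketch below reduces this to a single-agent, tabular Q-learning update over discretized market states and three trading positions (short, flat, long). The state discretization, reward definition, and the constants alpha, gamma, and epsilon are our assumptions, not the paper's.

```python
# Deliberately reduced illustration: single-agent, tabular Q-learning over
# discretized market states and three positions (short, flat, long).
# The paper itself uses cooperating agents and neural-network value approximation;
# N_STATES, alpha, gamma and epsilon below are our assumptions.

import numpy as np

rng = np.random.default_rng(0)
N_STATES = 10                      # number of discretized market states (assumption)
ACTIONS = (-1, 0, 1)               # short, flat, long
Q = np.zeros((N_STATES, len(ACTIONS)))

def act(s, epsilon=0.1):
    """Epsilon-greedy choice over the three trading positions for state s."""
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    return int(Q[s].argmax())

def q_update(s, a, reward, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max_a' Q(s_next, a')."""
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
```

This contrasts with the direct-reinforcement approach of the main paper, which adjusts the trading policy itself from a performance function rather than learning a value function first.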